人脸生成(Face Generation)

在该项目中,你将使用生成式对抗网络(Generative Adversarial Nets)来生成新的人脸图像。

获取数据

该项目将使用以下数据集:

  • MNIST
  • CelebA

由于 CelebA 数据集比较复杂,而且这是你第一次使用 GANs。我们想让你先在 MNIST 数据集上测试你的 GANs 模型,以让你更快的评估所建立模型的性能。

如果你在使用 FloydHub, 请将 data_dir 设置为 "/input" 并使用 FloydHub data ID "R5KrjnANiKVhLWAkpXhNBe".

In [1]:
data_dir = './data'

# FloydHub - Use with data ID "R5KrjnANiKVhLWAkpXhNBe"
#data_dir = '/input'


"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import helper

helper.download_extract('mnist', data_dir)
helper.download_extract('celeba', data_dir)
Found mnist Data
Found celeba Data

探索数据(Explore the Data)

MNIST

MNIST 是一个手写数字的图像数据集。你可以更改 show_n_images 探索此数据集。

In [2]:
show_n_images = 25

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
%matplotlib inline
import os
from glob import glob
from matplotlib import pyplot

mnist_images = helper.get_batch(glob(os.path.join(data_dir, 'mnist/*.jpg'))[:show_n_images], 28, 28, 'L')
pyplot.imshow(helper.images_square_grid(mnist_images, 'L'), cmap='gray')
Out[2]:
<matplotlib.image.AxesImage at 0x7f8833f59470>

CelebA

CelebFaces Attributes Dataset (CelebA) 是一个包含 20 多万张名人图片及相关图片说明的数据集。你将用此数据集生成人脸,不会用不到相关说明。你可以更改 show_n_images 探索此数据集。

In [3]:
show_n_images = 25

"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
mnist_images = helper.get_batch(glob(os.path.join(data_dir, 'img_align_celeba/*.jpg'))[:show_n_images], 28, 28, 'RGB')
pyplot.imshow(helper.images_square_grid(mnist_images, 'RGB'))
Out[3]:
<matplotlib.image.AxesImage at 0x7f8833e87048>

预处理数据(Preprocess the Data)

由于该项目的重点是建立 GANs 模型,我们将为你预处理数据。

经过数据预处理,MNIST 和 CelebA 数据集的值在 28×28 维度图像的 [-0.5, 0.5] 范围内。CelebA 数据集中的图像裁剪了非脸部的图像部分,然后调整到 28x28 维度。

MNIST 数据集中的图像是单通道的黑白图像,CelebA 数据集中的图像是 三通道的 RGB 彩色图像

建立神经网络(Build the Neural Network)

你将通过部署以下函数来建立 GANs 的主要组成部分:

  • model_inputs
  • discriminator
  • generator
  • model_loss
  • model_opt
  • train

检查 TensorFlow 版本并获取 GPU 型号

检查你是否使用正确的 TensorFlow 版本,并获取 GPU 型号

In [4]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
from distutils.version import LooseVersion
import warnings
import tensorflow as tf

# Check TensorFlow Version
assert LooseVersion(tf.__version__) >= LooseVersion('1.0'), 'Please use TensorFlow version 1.0 or newer.  You are using {}'.format(tf.__version__)
print('TensorFlow Version: {}'.format(tf.__version__))

# Check for a GPU
if not tf.test.gpu_device_name():
    warnings.warn('No GPU found. Please use a GPU to train your neural network.')
else:
    print('Default GPU Device: {}'.format(tf.test.gpu_device_name()))
TensorFlow Version: 1.0.0
Default GPU Device: /gpu:0

输入(Input)

部署 model_inputs 函数以创建用于神经网络的 占位符 (TF Placeholders)。请创建以下占位符:

  • 输入图像占位符: 使用 image_widthimage_heightimage_channels 设置为 rank 4。
  • 输入 Z 占位符: 设置为 rank 2,并命名为 z_dim
  • 学习速率占位符: 设置为 rank 0。

返回占位符元组的形状为 (tensor of real input images, tensor of z data, learning rate)。

In [5]:
import problem_unittests as tests

def model_inputs(image_width, image_height, image_channels, z_dim):
    """
    Create the model inputs
    :param image_width: The input image width
    :param image_height: The input image height
    :param image_channels: The number of image channels
    :param z_dim: The dimension of Z
    :return: Tuple of (tensor of real input images, tensor of z data, learning rate)
    """
    input_images = tf.placeholder(tf.float32, (None,image_width, image_height, image_channels))
    input_z = tf.placeholder(tf.float32, (None, z_dim), name="z_dim")
    learning_rate = tf.placeholder(tf.float32)

    return (input_images, input_z, learning_rate)


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_inputs(model_inputs)
Tests Passed

辨别器(Discriminator)

部署 discriminator 函数创建辨别器神经网络以辨别 images。该函数应能够重复使用神经网络中的各种变量。 在 tf.variable_scope 中使用 "discriminator" 的变量空间名来重复使用该函数中的变量。

该函数应返回形如 (tensor output of the discriminator, tensor logits of the discriminator) 的元组。

In [6]:
def discriminator(images, reuse=False):
    """
    Create the discriminator network
    :param image: Tensor of input image(s)
    :param reuse: Boolean if the weights should be reused
    :return: Tuple of (tensor output of the discriminator, tensor logits of the discriminator)
    """
    alpha = 0.01
    with tf.variable_scope('discriminator', reuse=reuse):
        # 28*28
        x1 = tf.layers.conv2d(images, 64, 5, strides=2, padding='same')
        relu1 = tf.maximum(alpha * x1, x1)
        # 14x14x64
        
        x2 = tf.layers.conv2d(relu1, 128, 5, strides=2, padding='same')
        bn2 = tf.layers.batch_normalization(x2, training=True)
        relu2 = tf.maximum(alpha * bn2, bn2)
        dropout2 = tf.nn.dropout(relu2, keep_prob=0.5)
        # 7x7x128
        
        x3 = tf.layers.conv2d(dropout2, 256, 5, strides=2, padding='same')
        bn3 = tf.layers.batch_normalization(x3, training=True)
        relu3 = tf.maximum(alpha * bn3, bn3)
        dropout3 = tf.nn.dropout(relu3, keep_prob=0.5)
        # 4x4x256

        # Flatten it
        flat = tf.reshape(dropout3, (-1, 4*4*256))
        logits = tf.layers.dense(flat, 1)
        out = tf.sigmoid(logits)
        
        return out, logits


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_discriminator(discriminator, tf)
Tests Passed

生成器(Generator)

部署 generator 函数以使用 z 生成图像。该函数应能够重复使用神经网络中的各种变量。 在 tf.variable_scope 中使用 "generator" 的变量空间名来重复使用该函数中的变量。

该函数应返回所生成的 28 x 28 x out_channel_dim 维度图像。

In [7]:
def generator(z, out_channel_dim, is_train=True):
    """
    Create the generator network
    :param z: Input z
    :param out_channel_dim: The number of channels in the output image
    :param is_train: Boolean if generator is being used for training
    :return: The tensor output of the generator
    """
    alpha = 0.01
    with tf.variable_scope('generator',reuse=not is_train):
        x1 = tf.layers.dense(z, 7*7*512)
        
        x1 = tf.reshape(x1, (-1, 7, 7, 512))
        x1 = tf.layers.batch_normalization(x1, training=is_train)
        x1 = tf.maximum(alpha * x1, x1)
        # 7x7x512
        
        x2 = tf.layers.conv2d_transpose(x1, 256, 5, strides=2, padding='same')
        x2 = tf.layers.batch_normalization(x2, training=is_train)
        x2 = tf.maximum(alpha * x2, x2)
        # 14x14x256
        
        logits = tf.layers.conv2d_transpose(x2, out_channel_dim, 5, strides=2, padding='same')
        # 28x28x3 
        
        out = 0.5 * tf.tanh(logits)
        
        return out


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_generator(generator, tf)
Tests Passed

损失函数(Loss)

部署 model_loss 函数训练并计算 GANs 的损失。该函数应返回形如 (discriminator loss, generator loss) 的元组。

使用你已实现的函数:

  • discriminator(images, reuse=False)
  • generator(z, out_channel_dim, is_train=True)
In [8]:
def model_loss(input_real, input_z, out_channel_dim):
    """
    Get the loss for the discriminator and generator
    :param input_real: Images from the real dataset
    :param input_z: Z input
    :param out_channel_dim: The number of channels in the output image
    :return: A tuple of (discriminator loss, generator loss)
    """
    g_model = generator(input_z, out_channel_dim)
    d_model_real, d_logits_real = discriminator(input_real)
    d_model_fake, d_logits_fake = discriminator(g_model, reuse=True)
    smooth = 0.1
    
    d_loss_real = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_real, labels=tf.ones_like(d_model_real) * (1 - smooth)))
    d_loss_fake = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_fake, labels=tf.zeros_like(d_model_fake)))
    g_loss = tf.reduce_mean(
        tf.nn.sigmoid_cross_entropy_with_logits(logits=d_logits_fake, labels=tf.ones_like(d_model_fake)))

    d_loss = d_loss_real + d_loss_fake

    return d_loss, g_loss


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_loss(model_loss)
Tests Passed

优化(Optimization)

部署 model_opt 函数实现对 GANs 的优化。使用 tf.trainable_variables 获取可训练的所有变量。通过变量空间名 discriminatorgenerator 来过滤变量。该函数应返回形如 (discriminator training operation, generator training operation) 的元组。

In [9]:
def model_opt(d_loss, g_loss, learning_rate, beta1):
    """
    Get optimization operations
    :param d_loss: Discriminator loss Tensor
    :param g_loss: Generator loss Tensor
    :param learning_rate: Learning Rate Placeholder
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    :return: A tuple of (discriminator training operation, generator training operation)
    """
    t_vars = tf.trainable_variables()
    d_vars = [var for var in t_vars if var.name.startswith('discriminator')]
    g_vars = [var for var in t_vars if var.name.startswith('generator')]

    # Optimize
    with tf.control_dependencies(tf.get_collection(tf.GraphKeys.UPDATE_OPS)):
        d_train_opt = tf.train.AdamOptimizer(learning_rate, beta1=beta1).minimize(d_loss, var_list=d_vars)
        g_train_opt = tf.train.AdamOptimizer(learning_rate, beta1=beta1).minimize(g_loss, var_list=g_vars)

    return d_train_opt, g_train_opt


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
tests.test_model_opt(model_opt, tf)
Tests Passed

训练神经网络(Neural Network Training)

输出显示

使用该函数可以显示生成器 (Generator) 在训练过程中的当前输出,这会帮你评估 GANs 模型的训练程度。

In [10]:
"""
DON'T MODIFY ANYTHING IN THIS CELL
"""
import numpy as np

def show_generator_output(sess, n_images, input_z, out_channel_dim, image_mode):
    """
    Show example output for the generator
    :param sess: TensorFlow session
    :param n_images: Number of Images to display
    :param input_z: Input Z Tensor
    :param out_channel_dim: The number of channels in the output image
    :param image_mode: The mode to use for images ("RGB" or "L")
    """
    cmap = None if image_mode == 'RGB' else 'gray'
    z_dim = input_z.get_shape().as_list()[-1]
    example_z = np.random.uniform(-1, 1, size=[n_images, z_dim])

    samples = sess.run(
        generator(input_z, out_channel_dim, False),
        feed_dict={input_z: example_z})

    images_grid = helper.images_square_grid(samples, image_mode)
    pyplot.imshow(images_grid, cmap=cmap)
    pyplot.show()

训练

部署 train 函数以建立并训练 GANs 模型。记得使用以下你已完成的函数:

  • model_inputs(image_width, image_height, image_channels, z_dim)
  • model_loss(input_real, input_z, out_channel_dim)
  • model_opt(d_loss, g_loss, learning_rate, beta1)

使用 show_generator_output 函数显示 generator 在训练过程中的输出。

注意:在每个批次 (batch) 中运行 show_generator_output 函数会显著增加训练时间与该 notebook 的体积。推荐每 100 批次输出一次 generator 的输出。

In [11]:
def train(epoch_count, batch_size, z_dim, learning_rate, beta1, get_batches, data_shape, data_image_mode):
    """
    Train the GAN
    :param epoch_count: Number of epochs
    :param batch_size: Batch Size
    :param z_dim: Z dimension
    :param learning_rate: Learning Rate
    :param beta1: The exponential decay rate for the 1st moment in the optimizer
    :param get_batches: Function to get batches
    :param data_shape: Shape of the data
    :param data_image_mode: The image mode to use for images ("RGB" or "L")
    """
    input_real, input_z,_ = model_inputs(data_shape[1],data_shape[2],data_shape[3], z_dim)
    out_channel_dim = 3 if data_image_mode == 'RGB' else 1
    d_loss, g_loss = model_loss(input_real, input_z,out_channel_dim)
        
    d_opt, g_opt = model_opt(d_loss, g_loss, learning_rate, beta1)
    
    steps = 0
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for epoch_i in range(epoch_count):
            for batch_images in get_batches(batch_size):
                steps += 1
                batch_z = np.random.uniform(-1, 1, size=(batch_size, z_dim))
                
                _ = sess.run(d_opt, feed_dict={input_real: batch_images, input_z: batch_z})
                _ = sess.run(g_opt, feed_dict={input_real: batch_images, input_z: batch_z})
                
                if steps % 100 == 0:
                    train_loss_d = d_loss.eval({input_z: batch_z, input_real: batch_images})
                    train_loss_g = g_loss.eval({input_z: batch_z})
                    print("Discriminator Loss: {:.4f}...".format(train_loss_d),"Generator Loss: {:.4f}".format(train_loss_g))
                    show_generator_output(sess, 25,input_z,out_channel_dim,data_image_mode)

MNIST

在 MNIST 上测试你的 GANs 模型。经过 2 次迭代,GANs 应该能够生成类似手写数字的图像。确保生成器 (generator) 低于辨别器 (discriminator) 的损失,或接近 0。

In [ ]:
batch_size = 32
z_dim = 100
learning_rate = 0.001
beta1 = 0.5


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
epochs = 2
mnist_dataset = helper.Dataset('mnist', glob(os.path.join(data_dir, 'mnist/*.jpg')))
with tf.Graph().as_default():
    train(epochs, batch_size, z_dim, learning_rate, beta1, mnist_dataset.get_batches,
          mnist_dataset.shape, mnist_dataset.image_mode)
Discriminator Loss: 1.2640... Generator Loss: 0.9028
Discriminator Loss: 1.5097... Generator Loss: 0.5765
Discriminator Loss: 1.2606... Generator Loss: 1.3199
Discriminator Loss: 1.3630... Generator Loss: 0.7210
Discriminator Loss: 1.1573... Generator Loss: 0.8154
Discriminator Loss: 1.5771... Generator Loss: 0.6056
Discriminator Loss: 1.3500... Generator Loss: 0.8662
Discriminator Loss: 1.3284... Generator Loss: 1.1470
Discriminator Loss: 1.2654... Generator Loss: 0.9515
Discriminator Loss: 1.1545... Generator Loss: 1.0450
Discriminator Loss: 0.8857... Generator Loss: 1.5140
Discriminator Loss: 1.6026... Generator Loss: 0.5621
Discriminator Loss: 1.3540... Generator Loss: 2.0724
Discriminator Loss: 1.1835... Generator Loss: 0.7979
Discriminator Loss: 1.0636... Generator Loss: 1.5013
Discriminator Loss: 1.4414... Generator Loss: 2.0219
Discriminator Loss: 1.7075... Generator Loss: 0.3522
Discriminator Loss: 1.2610... Generator Loss: 0.7683
Discriminator Loss: 1.0315... Generator Loss: 1.1801
Discriminator Loss: 1.4299... Generator Loss: 0.5480
Discriminator Loss: 1.0066... Generator Loss: 1.4404
Discriminator Loss: 1.9687... Generator Loss: 0.3190
Discriminator Loss: 0.8074... Generator Loss: 1.6129
Discriminator Loss: 1.2714... Generator Loss: 0.7413
Discriminator Loss: 1.0468... Generator Loss: 1.4351
Discriminator Loss: 0.7599... Generator Loss: 1.5110
Discriminator Loss: 0.7607... Generator Loss: 1.2956
Discriminator Loss: 0.8164... Generator Loss: 2.4285
Discriminator Loss: 0.7931... Generator Loss: 1.4771
Discriminator Loss: 1.1198... Generator Loss: 1.3622
Discriminator Loss: 0.9056... Generator Loss: 1.1954
Discriminator Loss: 0.4735... Generator Loss: 2.2398
Discriminator Loss: 0.7990... Generator Loss: 1.6102
Discriminator Loss: 0.7306... Generator Loss: 1.5000
Discriminator Loss: 1.0193... Generator Loss: 1.3004
Discriminator Loss: 1.0944... Generator Loss: 2.1973
Discriminator Loss: 1.3826... Generator Loss: 1.2417

CelebA

在 CelebA 上运行你的 GANs 模型。在一般的GPU上运行每次迭代大约需要 20 分钟。你可以运行整个迭代,或者当 GANs 开始产生真实人脸图像时停止它。

In [ ]:
batch_size = 16
z_dim = 100
learning_rate = 0.001
beta1 = 0.4


"""
DON'T MODIFY ANYTHING IN THIS CELL THAT IS BELOW THIS LINE
"""
epochs = 1

celeba_dataset = helper.Dataset('celeba', glob(os.path.join(data_dir, 'img_align_celeba/*.jpg')))
with tf.Graph().as_default():
    train(epochs, batch_size, z_dim, learning_rate, beta1, celeba_dataset.get_batches,
          celeba_dataset.shape, celeba_dataset.image_mode)
Discriminator Loss: 1.1524... Generator Loss: 2.0479
Discriminator Loss: 0.7427... Generator Loss: 1.9039
Discriminator Loss: 0.8810... Generator Loss: 1.0912
Discriminator Loss: 1.1570... Generator Loss: 2.5740
Discriminator Loss: 0.8716... Generator Loss: 1.4027
Discriminator Loss: 1.1077... Generator Loss: 0.7669
Discriminator Loss: 1.3907... Generator Loss: 0.9702
Discriminator Loss: 0.8651... Generator Loss: 1.6820
Discriminator Loss: 0.8792... Generator Loss: 1.3750
Discriminator Loss: 1.3186... Generator Loss: 0.9807
Discriminator Loss: 0.8709... Generator Loss: 1.1877
Discriminator Loss: 1.1743... Generator Loss: 0.7949
Discriminator Loss: 0.5258... Generator Loss: 1.8304
Discriminator Loss: 2.2449... Generator Loss: 0.4046
Discriminator Loss: 1.5316... Generator Loss: 0.4384
Discriminator Loss: 1.3727... Generator Loss: 1.0381
Discriminator Loss: 1.2221... Generator Loss: 0.6302
Discriminator Loss: 1.4899... Generator Loss: 0.6116
Discriminator Loss: 0.9308... Generator Loss: 0.7656
Discriminator Loss: 1.1079... Generator Loss: 0.6627
Discriminator Loss: 1.3980... Generator Loss: 0.6407
Discriminator Loss: 1.3282... Generator Loss: 0.5976
Discriminator Loss: 1.5380... Generator Loss: 0.8926
Discriminator Loss: 1.6738... Generator Loss: 0.6926
Discriminator Loss: 1.4807... Generator Loss: 0.7144
Discriminator Loss: 1.4302... Generator Loss: 0.9704
Discriminator Loss: 1.2763... Generator Loss: 1.5093
Discriminator Loss: 1.2392... Generator Loss: 1.0697
Discriminator Loss: 1.5409... Generator Loss: 0.9979
Discriminator Loss: 1.3508... Generator Loss: 0.8025
Discriminator Loss: 1.3750... Generator Loss: 0.8725
Discriminator Loss: 1.4137... Generator Loss: 1.1951
Discriminator Loss: 1.3135... Generator Loss: 0.7202
Discriminator Loss: 1.4068... Generator Loss: 0.7235
Discriminator Loss: 1.2003... Generator Loss: 0.8974
Discriminator Loss: 1.7982... Generator Loss: 0.4167
Discriminator Loss: 1.4422... Generator Loss: 0.8258
Discriminator Loss: 1.3811... Generator Loss: 0.8319
Discriminator Loss: 1.3138... Generator Loss: 0.8780
Discriminator Loss: 1.4926... Generator Loss: 0.9854
Discriminator Loss: 2.0413... Generator Loss: 0.4011
Discriminator Loss: 1.2778... Generator Loss: 0.9066

提交项目

提交本项目前,确保运行所有 cells 后保存该文件。

保存该文件为 "dlnd_face_generation.ipynb", 并另存为 HTML 格式 "File" -> "Download as"。提交项目时请附带 "helper.py" 和 "problem_unittests.py" 文件。